AITopics | cross-lingual language model pretraining

Cross-lingual Language Model Pretraining

Neural Information Processing Systems, Dec-25-2025, 22:46:44 GMT

Recent studies have demonstrated the efficiency of generative pretraining for English natural language understanding. In this work, we extend this approach to multiple languages and show the effectiveness of cross-lingual pretraining. We propose two methods to learn cross-lingual language models (XLMs): one unsupervised that only relies on monolingual data, and one supervised that leverages parallel data with a new cross-lingual language model objective. We obtain state-of-the-art results on cross-lingual classification, unsupervised and supervised machine translation. On XNLI, our approach pushes the state of the art by an absolute gain of 4.9% accuracy. On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU. On supervised machine translation, we obtain a new state of the art of 38.5 BLEU on WMT'16 Romanian-English, outperforming the previous best approach by more than 4 BLEU. Our code and pretrained models will be made publicly available.
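The abstract does not spell out how the objective over parallel data is set up. One plausible reading, sketched here as an assumption rather than as the authors' implementation, is to concatenate a sentence with its translation and mask tokens at random in both halves, so that a masked word can be predicted from context in either language. The function name `build_parallel_example`, the `[MASK]` symbol, the 0/1 language ids, the restarted positions, and the 15% masking rate are all illustrative choices made for this sketch.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, chosen for this sketch


def build_parallel_example(src_tokens, tgt_tokens, mask_prob=0.15, seed=0):
    """Concatenate a sentence pair and mask tokens at random in both halves.

    Returns (inputs, targets, lang_ids, positions). `targets` holds the
    original token at masked positions and None everywhere else.
    """
    rng = random.Random(seed)
    tokens = list(src_tokens) + list(tgt_tokens)
    # Language id per token: 0 for the source sentence, 1 for the translation.
    lang_ids = [0] * len(src_tokens) + [1] * len(tgt_tokens)
    # Positions restart at 0 for the second sentence (an assumption of this sketch).
    positions = list(range(len(src_tokens))) + list(range(len(tgt_tokens)))

    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)   # hide the token from the model
            targets.append(tok)   # and ask the model to recover it
        else:
            inputs.append(tok)
            targets.append(None)  # no prediction needed at this position
    return inputs, targets, lang_ids, positions


if __name__ == "__main__":
    en = ["the", "cat", "sleeps"]
    fr = ["le", "chat", "dort"]
    for row in zip(*build_parallel_example(en, fr, mask_prob=0.3)):
        print(row)
```

Dropping the second sentence reduces this to masking over monolingual text only, which is one way to picture the unsupervised variant mentioned in the abstract.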

cross-lingual language model pretraining, machine translation, name change

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Reviews: Cross-lingual Language Model Pretraining

Neural Information Processing Systems, Jan-26-2025, 20:11:31 GMT

This paper uses three techniques for incorporating multi-lingual (rather than just mono-lingual) information for pretraining contextualised representations: (i) autoregressive language modelling objective (e.g. [...] The methods are evaluated on four tasks: (i) cross-lingual classification (XNLI), (ii) unsupervised machine translation, (iii) supervised machine translation, and (iv) low-resource language modelling. These results are important as they showcase the strong benefit of multi-lingual (rather than just mono-lingual) pretraining for multiple important downstream tasks, and achieve a new state of the art.

Originality: While the methods are not particularly novel (autoregressive and masked language modelling pretraining have both been used before for ELMo and BERT; this work extends these objectives to the multi-lingual case), the performance gains on all four tasks are still very impressive. The empirical results are strong, and the methodology is sound and explained in sufficient technical detail.

Clarity: The paper is well-written, makes the connections with the relevant earlier work, and includes important details that can facilitate reproducibility (e.g. the learning rate, number of layers, etc.).
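The review names autoregressive and masked language modelling as the two existing objectives being extended. As a rough illustration of how their training targets differ, and not a reproduction of the paper's code, the sketch below builds (context, target) pairs for each. The function names `causal_lm_pairs` and `masked_lm_pairs` and the `[MASK]` symbol are hypothetical.

```python
def causal_lm_pairs(tokens):
    """Autoregressive objective: predict each token from the tokens before it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def masked_lm_pairs(tokens, masked_positions, mask="[MASK]"):
    """Masked objective: predict hidden tokens from context on both sides."""
    corrupted = [mask if i in masked_positions else t for i, t in enumerate(tokens)]
    # One (corrupted sentence, position, original token) triple per masked slot.
    return [(corrupted, i, tokens[i]) for i in sorted(masked_positions)]


if __name__ == "__main__":
    sentence = ["a", "b", "c"]
    print(causal_lm_pairs(sentence))
    print(masked_lm_pairs(sentence, {1}))
```

In the autoregressive case the context only ever contains earlier tokens, while in the masked case the model sees the full sentence with some positions hidden.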

cross-lingual language model pretraining, empirical result, machine translation

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Reviews: Cross-lingual Language Model Pretraining

Neural Information Processing Systems, Jan-26-2025, 20:11:20 GMT

This paper studies the problem of cross-lingual language model pretraining.

Pros
• An important problem is studied.

Cons
• The proposed methods are not particularly novel.

All the reviewers liked the paper.

cross-lingual language model pretraining

Genre: Overview (0.74)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.80)

Cross-lingual Language Model Pretraining

Conneau, Alexis; Lample, Guillaume

Neural Information Processing Systems, Mar-18-2020, 23:18:36 GMT

Recent studies have demonstrated the efficiency of generative pretraining for English natural language understanding. In this work, we extend this approach to multiple languages and show the effectiveness of cross-lingual pretraining. We propose two methods to learn cross-lingual language models (XLMs): one unsupervised that only relies on monolingual data, and one supervised that leverages parallel data with a new cross-lingual language model objective. We obtain state-of-the-art results on cross-lingual classification, unsupervised and supervised machine translation. On XNLI, our approach pushes the state of the art by an absolute gain of 4.9% accuracy.

cross-lingual language model pretraining, machine translation, supervised machine translation

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)
